Multimodal learning

Multimodal learning, in the context of machine learning, is a type of deep learning that combines multiple modalities of data, such as text, audio, or images, to build a more robust model of the real-world phenomenon in question. In contrast, unimodal learning analyzes a single modality in isolation, for example text (typically represented as a feature vector) or image data (consisting of pixel intensities and annotation tags). Multimodal machine learning combines these fundamentally different kinds of data using specialized modeling strategies and algorithms, yielding a model that more closely represents the real world.
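As an illustrative sketch only (not drawn from any specific system described here), one common strategy is to encode each modality separately and then fuse the resulting feature vectors before a shared prediction head. The class name, feature dimensions, and stand-in inputs below are hypothetical choices for the example, assuming PyTorch is available.

```python
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    """Toy multimodal model: encode each modality separately, then
    concatenate the embeddings and classify the fused representation."""

    def __init__(self, text_dim=300, image_dim=2048, hidden_dim=256, num_classes=10):
        super().__init__()
        # Modality-specific encoders (stand-ins for e.g. a text model and a CNN).
        self.text_encoder = nn.Sequential(nn.Linear(text_dim, hidden_dim), nn.ReLU())
        self.image_encoder = nn.Sequential(nn.Linear(image_dim, hidden_dim), nn.ReLU())
        # Joint head operating on the concatenated (fused) features.
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, text_features, image_features):
        t = self.text_encoder(text_features)
        v = self.image_encoder(image_features)
        fused = torch.cat([t, v], dim=-1)  # simple fusion by concatenation
        return self.classifier(fused)

# Example usage with random stand-in features for a batch of 4 samples.
model = LateFusionClassifier()
text_batch = torch.randn(4, 300)    # e.g. averaged word embeddings
image_batch = torch.randn(4, 2048)  # e.g. pooled CNN features
logits = model(text_batch, image_batch)
print(logits.shape)  # torch.Size([4, 10])
```

Concatenation is only the simplest fusion choice; practical systems may instead use attention-based or bilinear fusion to model interactions between modalities.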
